PREFACE


The mission of the Intelligent Training Branch of the
Technical Training Research Division of the Human Resources
Directorate of the Armstrong Laboratory (AL/HRTI) is to
design, develop, and evaluate the application of artificial
intelligence (AI) technologies to computer-assisted training
systems. The current effort was undertaken as part of
HRTI's research on intelligent tutoring systems (ITS) and
ITS development tools. The work was accomplished under
work unit 1121-09-81, Application of Artificial Neural
Networks to Modeling Student Performance. The proposal for
this research was solicited using a Broad Agency
Announcement.
Probabilistic Student Modeling with Knowledge Space Theory¹


Michael Villano and Charles Bloom


Honeywell Sensor and System Development Center
3660 Technology Drive
Minneapolis, MN 55418


Abstract 

This  article  presents  Knowledge  Space  Theory  (Falmagne  and  Doignon)  as  the 
foundation  for  a  probabilistic  student  model  to  be  imbedded  in  an  Intelligent 
Tutoring System (ITS). Applications to typical ITS student modeling issues such as
knowledge  representation,  adaptive  assessment,  curriculum  representation, 
advancement  criteria,  and  student  feedback  are  discussed.  Several  factors  contribute 
to  uncertainty  in  student  modeling  such  as  careless  errors  and  lucky  guesses, 
learning  and  forgetting,  and  unanticipated  student  response  patterns.  However,  a 
probabilistic  student  model  can  represent  uncertainty  regarding  the  estimate  of  the 
student's  knowledge  and  can  be  tested  using  empirical  student  data  and  established 
statistical  techniques. 


Introduction 

The  student  model  in  an  Intelligent  Tutoring  System  provides  support  for  the 
following  functions:  adaptively  assessing  the  student's  mastery  of  the  course 
material,  representing  the  student's  progress  through  the  curriculum,  selecting  the 
appropriate  level  of  hinting  and  explanation,  determining  advancement  and 
facilitating  student  feedback.  Ideally,  the  student  model  should  maintain  as  much 
information  about  the  student's  knowledge  as  is  necessary  to  meet  the  demands  of 
the  ITS.  In  addition  to  dynamically  adapting  to  new  information  obtained  from  the 
student's responses during an individual's interaction with the ITS, the student
model should also be capable of utilizing assessment experience obtained from a
population of students. The motivation for a probabilistic student model stems
from the need to represent uncertainty regarding the estimate of the student's
knowledge. Several factors contribute to uncertainty in student modeling, such as
careless errors and lucky guesses in the student's responses, change in the student's
knowledge due to learning and forgetting, and patterns of student responses simply
unanticipated by the designer of the student model.

The purpose of this paper is to consider the application of Knowledge Space
Theory (Falmagne and Doignon) as a probabilistic student model imbedded in an
Intelligent Tutoring System (ITS). An ITS for high school mathematics is in the
planning stages using KST (Falmagne, personal communication, November 1991).


¹This research was supported in part by Armstrong Laboratory, Human Systems Division (AFSC),
United States Air Force, Brooks AFB, TX 78235-5600 under contract number F33615-91-C-0002.
Probabilistic Student Model

Knowledge  Space  Theory  was  developed  primarily  for  adaptive,  computerized 
knowledge  assessment.  Therefore,  some  conjecture  will  be  necessary  to  infer  how 
the  various  functions  of  a  student  model  could  be  handled  by  this  theory. 

Knowledge  Representation  in  the  Student  Model 

Knowledge  Space  Theory 

A  comprehensive  theory  of  knowledge  representation  and  assessment  has  been 
developed  by  Falmagne,  Doignon  and  their  associates  (Falmagne  and  Doignon,  1985; 
Falmagne,  Koppen,  Villano,  Doignon  and  Johannesen,  1990).  In  their  Knowledge 
Space  Theory  (KST),  the  basic  unit  of  knowledge  is  an  item.  Each  item  can  be  in  the 
form  of  a  problem  or  an  equivalence  class  of  problems  that  the  student  has  to  solve. 
An  item  may  also  be  presented  as  a  task  which  the  student  has  to  perform  if  the  goal 
is  to  assess  procedural  knowledge.  Thus,  a  body  of  knowledge  is  characterized  by  a 
set  of  items  called  the  domain.  The  following  items  will  be  used  as  examples 
throughout  the  text: 


a. 4 × 7 = ?     b. 1/4 × 1/7 = ?     c. 0.4 × 7 = ?     d. 40% of 7 = ?

The student's knowledge state is defined as the collection of items the student is
capable of solving. For example, the knowledge state {a, b, d} corresponds to a
student who can solve Items a, b and d but who could not solve Item c. Not all
subsets of items are considered to be feasible states. For example, if a student is
capable of solving the percentage problem (Item d) then we may be able to infer that
the student can perform single-digit multiplication (Item a) and thus, any state that
contained Item d would contain Item a. We also might not expect to find a student
who could answer Item d and none of the other items, thus {d} would not be
considered a feasible state. The collection of all feasible states is called the knowledge
structure. A knowledge structure must contain the null state { }, which corresponds
to the student who fails all the items, and the domain, which corresponds to the
student who has mastered all the items. An example knowledge structure for the
four items a, b, c, d appears in Figure 1.


             {a, b} ------ {a, b, d}
            /      \                \
{ } --- {a}         {a, b, c} --- {a, b, c, d}
            \      /                /
             {a, c} ------ {a, c, d}

Figure  1.  Example  knowledge  structure. 


An important special case of a knowledge structure occurs when the collection
of states is closed under union. That is: if two subsets of items are states in the
knowledge structure then their union is also a state. A knowledge structure
satisfying this condition is called a knowledge space. In Figure 1, notice that
{a, b, d} ∪ {a, c, d} = {a, b, c, d}, which is a state in the knowledge space. An additional
and stronger condition on a knowledge space involves the assumption that any
knowledge state is on a learning path consisting of an increasing sequence of
states. Beginning with the null state and finishing with the full set of items, any
state in the path different from the null state contains exactly one more problem
than the preceding state. Such a learning path is called a gradation. The following
chain of states illustrates one of the gradations from the knowledge space in Figure
1: { } ⊂ {a} ⊂ {a, b} ⊂ {a, b, d} ⊂ {a, b, c, d}. If every state of the knowledge structure is
contained in at least one gradation, the knowledge space is said to be well-graded.
The knowledge space in Figure 1 is well-graded. The four gradations can be
represented by the corresponding order in which the items can be mastered: abdc,
abcd, acbd, and acdb.
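
The definitions above can be checked mechanically on the small example. The following is a minimal sketch, not software from the KST literature: the names STATES, is_knowledge_space, gradations and is_well_graded are illustrative assumptions, and the brute-force search over all item orderings is only practical for very small domains.

```python
from itertools import permutations

# States of the example structure in Figure 1, stored as frozensets of items.
DOMAIN = frozenset("abcd")
STATES = {frozenset(s) for s in ["", "a", "ab", "ac", "abc", "abd", "acd", "abcd"]}


def is_knowledge_space(states):
    """True if the null state is present and the collection is closed under union."""
    return frozenset() in states and all(
        (k | other) in states for k in states for other in states
    )


def gradations(states, domain):
    """Return every item ordering whose successive prefixes are all states."""
    found = []
    for order in permutations(sorted(domain)):
        prefixes = [frozenset(order[:i]) for i in range(len(order) + 1)]
        if all(p in states for p in prefixes):
            found.append("".join(order))
    return found


def is_well_graded(states, domain):
    """True if every state lies on at least one gradation."""
    covered = set()
    for order in gradations(states, domain):
        covered.update(frozenset(order[:i]) for i in range(len(order) + 1))
    return covered == set(states)


print(is_knowledge_space(STATES))   # closure under union holds for Figure 1
print(gradations(STATES, DOMAIN))   # the four learning paths listed in the text
print(is_well_graded(STATES, DOMAIN))
```

Removing a state such as {a, b, c} from STATES breaks closure under union, because {a, b} ∪ {a, c} would no longer be a state.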

There  are  two  concerns  to  be  addressed  regarding  storage  issues  for  this  form  of 
knowledge representation. For n items, there are 2^n possible knowledge states.
However, in the example, there are only 8 out of 16 possible states in the knowledge
space. There are n! possible gradations, but in the example there are only 4 out of 24
possible gradations. In practice, there are far fewer states (and gradations) than
the theoretical maximum. In the simplest case, if there is a simple order of the items
(a Guttman scale) yielding a single gradation (learning path) through the items,
then there are only n + 1 feasible states (the null state plus one state for each item
added along the path). In a study involving 50 items in high school
mathematics,  the  size  of  the  knowledge  spaces  obtained  from  systematically 
querying experts ranged from 900 to around 8,000 states, roughly the same order of
magnitude across experts (Kambouri, Koppen, Villano, & Falmagne, 1991). The sizes
of these knowledge spaces are far less than the theoretical maximum of 2^50 ≈ 10^15.

The  knowledge  space  forms  the  core  of  a  knowledge  assessment  system.  The 
goal  of  a  knowledge  assessment  system  is  to  locate,  as  efficiently  and  accurately  as 
possible,  a  student's  knowledge  state  in  the  knowledge  structure.  Stochastic 
knowledge  assessment  routines  have  been  developed  in  which  uncertainty 
regarding  the  student's  knowledge  state  is  represented  by  a  probability  distribution 
on the states (Falmagne & Doignon, 1988; Villano, 1991). To each state K in a
knowledge structure $\mathcal{K}$, we assign a probability P(K). The assessment routine updates
the probability distribution on the states to be consistent with the student's
responses to a carefully chosen sequence of items. From the probability distribution
on the states, we can also compute the probability of a correct response to an item, p(q),
as

$$p(q) = \sum_{K \in \mathcal{K}_q} P(K)$$

where $\mathcal{K}_q$ is the set of states which contain item q. The probability of an incorrect
response is 1 − p(q). Item parameters which can be estimated from stochastic
learning models (Falmagne, 1989a; Villano, 1991; Falmagne, 1991a; Falmagne, 1991b)
applied to empirical student data include the probability of a careless error and the
probability of a lucky guess.
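
To make the computation of p(q) concrete, here is a minimal sketch that sums the probabilities of the states containing an item. It assumes the Figure 1 structure with all eight states equally likely; the names STATES, dist and prob_correct are illustrative, and careless errors and lucky guesses are ignored.

```python
# Figure 1 states, each given probability 1/8 purely for illustration.
STATES = [frozenset(s) for s in ["", "a", "ab", "ac", "abc", "abd", "acd", "abcd"]]
dist = {state: 1.0 / len(STATES) for state in STATES}


def prob_correct(item, dist):
    """p(q): total probability of the states that contain the item."""
    return sum(p for state, p in dist.items() if item in state)


for q in "abcd":
    print(q, prob_correct(q, dist))
# a 0.875
# b 0.5
# c 0.5
# d 0.375
```

Item a belongs to seven of the eight states, so under the uniform distribution it is the item most likely to be answered correctly, while Item d belongs to only three.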

Student  Model  Construction 

The two basic steps involved in the construction of a probabilistic student model are
1) building the structural relationships among the items and 2) determining the
initial values for the probabilities in the models. Both of these steps rely on the
judgments of experts or require empirical data from a population of students.


Building  the  Structural  Model 

Several  methods  for  building  the  knowledge  structure  in  KST  have  been  explored: 

1)  Expert  Judgments  -  an  application  of  the  QUERY  Routine  has  been  carried 
out  by  Kambouri,  Koppen,  Villano  and  Falmagne  (1991).  QUERY  (Koppen  and 
Doignon,  1990)  is  a  computerized  procedure  designed  to  systematically  question  an 
experienced  teacher/tutor  and  obtain  the  expert's  "personal"  knowledge  structure. 
The  results  obtained  for  50  items  in  high  school  mathematics  revealed  that  the 
procedure could be applied in a realistic setting. Agreement among the experts was
obtained  on  gross  measures  such  as  item  difficulty,  the  relative  size  of  the  structures 
and  the  correlation  of  experts'  responses  given  to  the  same  questions  posed  by  the 
routine.  The  limitations  of  this  approach  include  a  lack  of  agreement  between  the 
states  which  make  up  the  experts'  structures  and  the  absence  of  an  estimate  of  the 
distribution  of  the  states  for  a  population  of  students.  The  lack  of  agreement 
between the experts suggests that some of the experts differ concerning their ability
to perform the task. Villano (1991) compared the performance of the individual
experts'  knowledge  structures  in  computerized  assessment  routines  and 
demonstrated  a  significant  advantage  in  using  some  experts'  structures  over  others. 

2) Empirical Data - Villano (1991) investigated various methods for building
knowledge  structures.  A  refinement  procedure  was  developed  which  involved  the 
application  of  a  probabilistic  model  to  a  large  (N=60,000)  reference  set  of  students. 
Guttman-scale  based  structures  were  built  by  ordering  items  by  increasing  item 
difficulty  to  form  a  single  learning  path  through  the  items.  A  third  method  utilized 
repeated  applications  of  stochastic  assessment  routines  to  determine  the  collection 
of  feasible  states  from  the  power  set. 

3) Neural Networks - A novel application of neural networks to construct a
knowledge structure has been demonstrated by Harp, Samad and Villano (1992).
Self-organizing feature maps are used to capture the possible states of student
knowledge from an existing test database in the domain of aircraft fuel
management.  Noise-tolerance  and  insensitivity  to  feature  map  parameter  values 
are  demonstrated. 
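
Of the construction methods listed above, the Guttman-scale approach is simple enough to sketch directly: order the items by increasing difficulty and take the nested prefixes of that ordering as the states. This is only an illustration under stated assumptions; the function name guttman_structure and the difficulty values are hypothetical and are not taken from Villano (1991).

```python
def guttman_structure(difficulty):
    """Order items by increasing difficulty and return the chain of nested states."""
    order = sorted(difficulty, key=difficulty.get)
    return [frozenset(order[:i]) for i in range(len(order) + 1)]


# Hypothetical difficulties (e.g., proportion of a reference sample failing each item).
difficulty = {"a": 0.10, "b": 0.35, "c": 0.40, "d": 0.60}

for state in guttman_structure(difficulty):
    print(sorted(state))
# []
# ['a']
# ['a', 'b']
# ['a', 'b', 'c']
# ['a', 'b', 'c', 'd']
```

The result is a single learning path, so it cannot represent alternative routes such as mastering Item c before Item b.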

Initial  Uncertainty  of  the  Student  Model 

In  the  absence  of  empirical  data  (in  the  form  of  expert  judgments  or  student 
responses) regarding the likelihood of knowledge states in a population of students,
all the states would need to be considered equally likely. However, one of the
significant benefits of probabilistic student models is the capability of incorporating
knowledge about a population of students to improve the initial estimate of an
individual student by the student model.

A  variety  of  a  priori  probability  distributions  on  the  knowledge  states  have  been 
studied  by  Villano  (1991).  The  following  distributions  were  implemented  and 
evaluated in stochastic assessment routines:

1) Uniform Prior - all states in the structure are initially equiprobable.

2) Refined Prior - the probabilities of the states are parameters estimated
directly by applying a probabilistic model and maximum likelihood
techniques to a large (N=60,000) reference set of student responses.

3)  Assessed  Prior  -  the  probabilities  of  the  states  are  estimated  by  taking  the 
"average"  of  the  final  distributions  resulting  from  the  complete  assessment  of 
a  large  (N=60,000)  reference  set  of  student  responses. 
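
Two of the priors listed above can be sketched in a few lines. The text puts "average" in quotation marks, so the state-by-state arithmetic mean used for the assessed prior here is an assumption, as are the function names and the toy distributions; the refined prior is omitted because it requires fitting a probabilistic model by maximum likelihood.

```python
def uniform_prior(states):
    """Uniform prior: every state in the structure is equally probable."""
    p = 1.0 / len(states)
    return {state: p for state in states}


def assessed_prior(final_distributions):
    """Assessed prior, taken here as the state-by-state mean of the final
    distributions obtained from assessing a reference set of students."""
    n = len(final_distributions)
    states = final_distributions[0].keys()
    return {s: sum(d[s] for d in final_distributions) / n for s in states}


# Toy example with two states and two previously assessed students.
K1, K2 = frozenset("a"), frozenset("ab")
finals = [{K1: 0.2, K2: 0.8}, {K1: 0.6, K2: 0.4}]
print(uniform_prior([K1, K2]))
print(assessed_prior(finals))
```

Because each input distribution sums to 1, their mean does as well, so no renormalization is needed.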

Falmagne  (personal  communication,  November  1991)  suggests  that  information 
regarding  the  background  of  the  student,  such  as  the  student's  age,  level  of 
education,  prior  training,  etc.  could  also  be  used  to  "prime"  the  a  priori  probability 
distribution. This priming could be similar in spirit to the forward chaining that is
often  done  at  the  start  of  some  diagnostic  expert  systems. 

Applications  of  the  Student  Model 

Adaptive  Assessment  Item  Selection 

For  an  assessment  routine  to  be  adaptive,  it  must  be  capable  of  determining  the  next 
"best" question to pose to the student based on a dynamic student model. In KST,
one method for selecting the most "informative" item to ask is to choose the item
with the least predictable response (Falmagne & Doignon, 1988; Villano, 1991). For
the half-split item selection rule, we choose the item whose probability of being
answered correctly, p(q), is closest to .5. The reasoning is as follows. If p(a) = .85, then
Item a would not be informative because it is almost certain the student would
respond correctly. If p(d) = .1, then Item d is not informative because we are fairly
certain (1 − .1 = .9) that the student would fail this item. If p(c) = .5, then Item c would
be the most informative item to ask of these three because there would be an equal
chance of the student passing or failing Item c. Item c is thus the item for which our
estimate of how the student will respond is the most uncertain. (If two or more
items are equally informative, then we randomly choose from among those items.)
There is also an entropy-based rule in KST in which we try to select the item which will
bring about the greatest reduction in the entropy of the probability distribution on
the states, but it has been shown to be equivalent to the half-split question selection
rule under certain conditions (Falmagne and Doignon, 1988).
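
The half-split rule described above can be sketched as follows, using the three probabilities from the example in the text. The function name half_split_item and its arguments are illustrative assumptions; ties are broken at random, as the text specifies.

```python
import math
import random


def half_split_item(p_correct, rng=random):
    """Return the item whose p(q) is closest to 0.5, choosing at random among ties."""
    gaps = {q: abs(p - 0.5) for q, p in p_correct.items()}
    best = min(gaps.values())
    candidates = [q for q, g in gaps.items() if math.isclose(g, best, abs_tol=1e-12)]
    return rng.choice(candidates)


# Probabilities of a correct response from the worked example in the text.
print(half_split_item({"a": 0.85, "c": 0.5, "d": 0.1}))  # c
```

In a full assessment routine, the dictionary of p(q) values would be recomputed from the current distribution on the states before each question.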

Adaptive  Assessment  Updating  Routine 

A dynamic student model would necessarily require periodic updating with each
new item response obtained from the student. In order to perform adaptive
assessment, an updating rule must be specified to maintain the current estimate of
the student's performance.

In the stochastic knowledge assessment routines of Falmagne and Doignon
(1988; see also Villano, 1991), the probability distribution on the states is maintained through
an application of an updating rule which modifies the probabilities of the states to be
consistent with each new response obtained from the student. For example, if the
student responds correctly to an item, then the probabilities of the states which
contain that item are increased, while the probabilities of the states which do not
contain that item are decreased. Various updating rules are possible. The
multiplicative updating rule specifies an operator (greater than 1) which is used to
increase (by multiplying and then normalizing) the probability of the states
consistent with the student's response. The larger the value of the parameter, the
greater the change in the distribution. The calibration of such a parameter has been
demonstrated by Villano (1991).

The  multiplicative  operator  can  be  indexed  by  the  item  asked  and  the  response 
given  (correct  or  incorrect).  Thus,  a  correct  response  to  a  particularly  diagnostic  item 
could  have  a  stronger  effect  on  the  change  in  the  mass  of  the  probability  distribution 
than  some  other  less  diagnostic  item.  (For  example,  items  with  low  probability  of  a 
careless  error  could  have  higher  update  parameters.  The  response  to  these  items 
would  be  judged  to  be  more  reliable  due  to  the  lower  error  rate.)  During  the 
instructional  phase  of  an  ITS,  incorrect  responses  may  be  more  prevalent  as  the 
student  may  be  less  cautious  than  during  testing.  Therefore,  incorrect  responses 
could  have  lower  updating  parameters  set  to  exert  less  influence  on  the  change  in 
the  distribution  during  the  instructional  phase.  The  multiplicative  updating  rule 
can  be  regarded  as  a  generalization  of  a  Bayesian  updating  rule  as  pointed  out  by 
Koppen  in  Falmagne  and  Doignon  (1988). 
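
A minimal sketch of the multiplicative updating rule follows. The parameter name zeta, the function name, and the uniform starting distribution are assumptions made for illustration; an item- and response-specific operator would simply be looked up before the call.

```python
def multiplicative_update(dist, item, correct, zeta):
    """Multiply the states consistent with the response by zeta (> 1), then renormalize.

    A state is consistent with a correct response if it contains the item,
    and consistent with an incorrect response if it does not.
    """
    if zeta <= 1:
        raise ValueError("the multiplicative operator must be greater than 1")
    weighted = {
        state: p * (zeta if (item in state) == correct else 1.0)
        for state, p in dist.items()
    }
    total = sum(weighted.values())
    return {state: w / total for state, w in weighted.items()}


# Figure 1 states with a uniform starting distribution (an assumption).
STATES = [frozenset(s) for s in ["", "a", "ab", "ac", "abc", "abd", "acd", "abcd"]]
prior = {state: 1.0 / len(STATES) for state in STATES}

posterior = multiplicative_update(prior, "d", correct=True, zeta=2.0)
p_d = sum(p for state, p in posterior.items() if "d" in state)
print(p_d)  # rises from 3/8 to 6/11 after a correct response to Item d
```

A larger zeta moves more probability mass per response, which is why a less reliable response type (for example, an incorrect answer during instruction) could be given a value closer to 1.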

Knowledge  Type  Representations 

Both  declarative  and  procedural  knowledge  can  be  represented  and  integrated  in  a 
knowledge structure. A distinction between these two traditional knowledge types
may be necessary for expository purposes (lesson presentation may differ for
teaching declarative vs. procedural knowledge) as well as for testing formats.
Procedural knowledge may be tested by presenting a task for the student to complete
and monitoring the student's performance. Declarative knowledge, although
implicitly tested during the completion of a task, can be assessed directly using
standard fill-in or multiple choice questions. In order to satisfy these and other
concerns for maintaining the distinction between declarative and procedural
knowledge, the complete knowledge structure encompassing both can be divided
into two "substructures" as indicated in Figure 2.

Curriculum Representation

Various learning paths through a curriculum may be represented in a student
model to accommodate different instructional strategies of educators and different
learning styles on the part of students. If an item has more than one unique set of
prerequisites, then alternative paths through the items should be represented. The
curriculum path determines the next lesson (associated with an item) to teach in a
directed, as opposed to an exploratory, ITS.

In KST, the learning paths (called gradations) may be used to guide the teaching of
the student. In a well-graded knowledge space, the next lesson to teach is the one
tested by the next item in the learning path. In the event that there is more than one
path to follow from the current knowledge state, you may choose the "easiest" item
(the item with the highest probability of being answered correctly), or the item along
the most traveled (i.e., most probable) learning path. Additional parameters which
should affect teaching include the history of the knowledge state over time and an
estimate of the learning rate of the student. The learning rate of the student can be
estimated using the stochastic learning path model specified by Falmagne (1991b).
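
The choice of the next lesson can be sketched for the Figure 1 structure. The candidate items are those that, when added to the current state, yield another state of the structure; among them, the "easiest" is the one with the highest probability of a correct response. The function names and the probability values are hypothetical.

```python
DOMAIN = frozenset("abcd")
STATES = {frozenset(s) for s in ["", "a", "ab", "ac", "abc", "abd", "acd", "abcd"]}


def next_items(state, states, domain):
    """Unmastered items q such that adding q to the state gives another state."""
    return sorted(q for q in domain - state if (state | {q}) in states)


def easiest_next_item(state, states, domain, p_correct):
    """Among the candidate next items, pick the one most likely to be answered correctly."""
    candidates = next_items(state, states, domain)
    return max(candidates, key=p_correct.get) if candidates else None


# Hypothetical current estimates of p(q) for the items.
p_correct = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.2}

current = frozenset("a")
print(next_items(current, STATES, DOMAIN))                    # ['b', 'c']
print(easiest_next_item(current, STATES, DOMAIN, p_correct))  # b
```

Choosing the item along the most probable learning path instead would require probabilities for the gradations themselves, which this sketch does not model.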
Knowledge Structure with Declarative and Procedural Knowledge

      {a}                       {A, a, b, c} --- {A, C, a, b, c}
     /   \                     /                                \
{ }       {a, b} --- {A, a, b}                                   {A, B, C, a, b, c, d}
     \   /                     \                                /
      {b}                       {A, a, b, d} --- {A, B, a, b, d}

Declarative Knowledge Substructure

      {a}            {a, b, c}
     /   \          /         \
{ }       {a, b}               {a, b, c, d}
     \   /          \         /
      {b}            {a, b, d}

Procedural Knowledge Substructure

            {A, C}
           /      \
{ } --- {A}        {A, B, C}
           \      /
            {A, B}

Figure 2. An example of a knowledge structure and two of its substructures. The
declarative items appear as small letters. The capital letters denote procedural items.


If  all  the  parameters  of  the  model  have  been  estimated  for  a  population  of  students, 
then  we  can  re-estimate  the  learning  rate  parameter  for  a  particular  student.  If  we 
observe n patterns of responses at different times, t_1, t_2, ..., t_n, then we can estimate
the student's learning rate by maximizing the likelihood of the student's learning
rate parameter at time t_n.

Hint  Level  Selection 

The  content  of  a  hint  or  explanation  in  an  ITS  relies  upon  the  student  model's 
representation  of  the  student's  level  of  mastery  of  the  material.  Advice  or  help  to  be 
presented  to  the  student  should  be  tailored  to  the  individual  student's  needs. 
Advanced  students  may  prefer  to  be  given  more  terse  explanations,  whereas  novice 
students  could  be  given  more  elaborate  guidance.  The  coaching  which  a  student 
receives from the ITS should be careful only to include references to concepts which
the  student  model  indicates  as  having  been  mastered. 

The level of explanation or hints in KST could depend upon whether the
student is being assessed at a coarse substructure or at a finer, more diagnostic level.
The height of an item in the knowledge structure is a rough measure of item
difficulty and could also be used to determine the level of hinting. The height of an
item, h(q), is defined as the smallest number of items which must be mastered before
q (Kambouri et al., 1991). In the example knowledge space in Figure 1, h(a) = 0 and
h(d) = 2. (There are no items which must be mastered before Item a and at least 2
items (a, b or a, c) which must be mastered before Item d.) If the items span a wide
range of heights, then each level of help could be associated with a particular height
interval. For example, if we wish to offer 3 levels of help in an ITS (beginner,
novice, advanced) and the heights for 20 items range from 0 to 12, then Table 1
shows one possible mapping of the item height to the hint level.


Table 1

Definition of Help Levels

Item Height     Help Level

0 to 3          beginner
4 to 8          novice
9 to 12         advanced
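
The height computation and the mapping in Table 1 can be sketched together. Here the height of an item is computed as the size of the smallest state containing it, minus one for the item itself, which is one way of operationalizing the definition above and reproduces h(a) = 0 and h(d) = 2 for Figure 1. The function names are illustrative assumptions.

```python
STATES = {frozenset(s) for s in ["", "a", "ab", "ac", "abc", "abd", "acd", "abcd"]}


def item_height(item, states):
    """h(q): size of the smallest state containing q, not counting q itself."""
    return min(len(state) for state in states if item in state) - 1


def help_level(height):
    """Map an item height to a help level using the intervals of Table 1."""
    if 0 <= height <= 3:
        return "beginner"
    if 4 <= height <= 8:
        return "novice"
    if 9 <= height <= 12:
        return "advanced"
    raise ValueError("height outside the range covered by Table 1")


for q in "abcd":
    print(q, item_height(q, STATES))
# a 0
# b 1
# c 1
# d 2
print(help_level(5))  # novice
```

With only four items, every item in Figure 1 falls in the lowest interval; the three levels become meaningful for larger domains such as the 20-item example in the text.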

An  item  parameter  such  as  the  probability  of  a  careless  error  may  also  influence 
hinting.  For  example,  if  an  item  had  a  relatively  high  probability  of  a  careless  error,  a 
hint  might  warn  the  student  to  take  extra  time  to  check  and  confirm  the  answer  to 
the  item. 

Advancement  Criterion 

A  student's  advancement  through  the  curriculum  may  need  to  be  directed  or  at  least 
monitored  by  the  ITS,  particularly  in  domains  such  as  mathematics,  where 
advanced concepts will not be easily mastered without a strong understanding of
fundamental  principles  or  prerequisites.  (A  debate  comparing  directed  vs. 
exploratory  learning  is  well  beyond  the  scope  of  this  paper.) 

The student would be expected to master the current item in the learning path
before moving on to more advanced items. If this constraint is relaxed, a criterion
could be specified to control advancement. Thus, mastery of an item could be
defined by a score on an equivalence class of items. In addition, a minimum number of
instances of an item may be required to which the student must respond.

Student  Feedback 

An  inspectable,  detailed  representation  of  the  learner's  mastery  of  the  material  can 
provide  feedback  regarding  the  student's  most  recent  accomplishments  and  most 
pressing  weaknesses. 

In KST, rather than reporting a single score (i.e., ability = 95%), we can be much
more specific and indicate the most advanced item that has been mastered as well as
a list of the missing items and/or future items to be mastered. If a single score is
preferred, we should not just use "blind" averaging of the scores on the items, but
rather take advantage of diagnostic information for the specific items, for example
by weighting the average score by the heights of the items. Ideally, we would prefer not
to lose the distinction between a student who can answer many simple items versus
one who can answer a few difficult items. A proposed project that has been on the
table for a number of years in Falmagne's lab (personal communication) involves
the generation of diagnostic dialogue once a student's knowledge state has been
isolated. The following issues would need to be addressed in the development of
such a system:
1) an analysis of the items in terms of skills or concepts should be performed.

2)  quantitative  aspects  of  skill  should  be  translated  into  linguistic  terms. 

3) degrees of doubt regarding the diagnosis should be expressed.

"Student probably knows multiplication, but may be weak in percents."

4)  broad  issues  in  generating  discourse  would  need  to  be  addressed. 
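
Returning to the single-score alternative mentioned earlier in this section, a height-weighted average might be sketched as below. The weight of height + 1 is an assumption introduced here so that items of height 0 still count; the text does not specify a weighting formula, and the response patterns are hypothetical.

```python
def height_weighted_score(scores, heights):
    """Average of item scores weighted by (height + 1); the +1 is an assumption
    so that items with height 0 still contribute to the score."""
    weights = {q: heights[q] + 1 for q in scores}
    total = sum(weights.values())
    return sum(scores[q] * weights[q] for q in scores) / total


# Heights of the Figure 1 items; 1 = correct, 0 = incorrect.
HEIGHTS = {"a": 0, "b": 1, "c": 1, "d": 2}
simple_items = {"a": 1, "b": 1, "c": 0, "d": 0}
harder_item = {"a": 1, "b": 0, "c": 0, "d": 1}

print(height_weighted_score(simple_items, HEIGHTS))  # 0.375
print(height_weighted_score(harder_item, HEIGHTS))   # 0.5
```

Both patterns have the same "blind" average of 0.5, but the weighted score separates the student who answered the more advanced item.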

Evaluation  of  a  Probabilistic  Student  Model 

An  important  consideration  for  utilizing  probabilistic  student  models  in  an  ITS  is 
the  ability  to  quantitatively  evaluate  their  effectiveness  using  established  statistical 
techniques  on  simulated  and  real  student  data.  Some  points  to  consider  when 
evaluating  student  models  are  given  below: 

Error  sensitivity.  The  responsiveness  of  the  student  model  to  careless  errors  or  lucky 
guesses  on  the  part  of  the  student  is  a  critical  feature  of  probabilistic  student  models 
and should be carefully studied.

Parameter  sensitivity.  How  critical  to  the  success  of  the  student  model  are  the  initial 
parameter  estimates?  The  importance  of  the  estimates  of  the  a  priori  probability 
distribution  on  the  states  and  the  item  parameters  has  to  be  reviewed. 

Efficiency.  Ideally,  a  student  model  should  minimize  the  number  of  questions  that 
are  necessary  to  obtain  an  accurate  assessment  of  the  student.  The  cost  of  asking 
additional items should be measured against any increase in assessment accuracy.

Learning  rate.  How  quickly  the  student  model  converges  to  an  accurate  estimate  of 
the state of the student's knowledge is also of interest and related to the issue of
efficiency. 

Assessment accuracy. Many of the above considerations rely upon some measure of
the quality of the assessment. One such measure involves computing a prediction
index (Villano, 1991) which represents the proportion of student responses correctly
predicted for items which have not yet been asked during the assessment.
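
One way such a prediction index might be computed is sketched below. The decision rule (predict a correct response when p(q) is at least .5) is an assumption made for illustration and may differ from the definition used by Villano (1991); the function name and the data are hypothetical.

```python
def prediction_index(p_correct, observed, asked):
    """Proportion of not-yet-asked items whose observed response matches the
    prediction, where a correct response is predicted when p(q) >= 0.5
    (an assumed decision rule)."""
    unasked = [q for q in observed if q not in asked]
    if not unasked:
        raise ValueError("no unasked items to predict")
    hits = sum((p_correct[q] >= 0.5) == observed[q] for q in unasked)
    return hits / len(unasked)


# Hypothetical model estimates and the student's full response pattern.
p_correct = {"a": 0.9, "b": 0.7, "c": 0.3, "d": 0.2}
observed = {"a": True, "b": True, "c": True, "d": False}

print(prediction_index(p_correct, observed, asked={"a"}))  # 2 of 3 unasked items predicted
```

Tracking this value after each question gives a simple picture of how quickly the assessment converges.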


Discussion 

Knowledge Space Theory was developed to conduct efficient, computerized student
assessments and therefore may be a viable choice as a probabilistic student model. A
variety of techniques have been investigated for building the structural and
probabilistic components of the student model in KST. A realistic concern regarding
the implementation of knowledge spaces is the possible combinatorial explosion in
terms of the size of a knowledge space if there is a significant lack of structure
among the items. The size of a knowledge structure may not be an important
consideration with the rapid increases in the power and storage capacity of modern
computers. A large number of research issues remain to be explored with regard to
applying probabilistic student models to an Intelligent Tutoring System. However,
the goal of developing a dynamic, non-deterministic student model capable of
robust, individualized assessment may be well worth the cost.